Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules

Internal identifier: 004F12 (Main/Exploration); previous: 004F11; next: 004F13

Authors: Martine Cadot [France]

RBID : Hal:tel-00594174

Abstract

This thesis is about data mining in the human sciences. Data mining, a branch of artificial intelligence, is a set of methods for extracting knowledge from electronic data. Among these methods, the extraction of itemsets and association rules builds a symbolic representation of the structure of the data, as classical statistical methods do, but unlike them it can handle complex and very large data sets. However, this computer science model, obtained by counting co-occurrences, is not easy for scientists to use: it works on dichotomous (True/False) data, its raw results are difficult to interpret, and its validity can seem doubtful to researchers who work with statistics. We propose four techniques, which we designed and tested on real data, to make the extraction of itemsets and association rules easier for scientists to use: 1) our randomisation test, based on "exchanges in cascade" in the subjects × properties matrix, identifies the statistically significant links between properties; 2) our fuzzification of itemset and association rule extraction produces fuzzy association rules close to the fuzzy rules defined by researchers in the fuzzy community around Zadeh; 3) our Midova algorithm extracts only interactions; and 4) our meta-rules clean the set of association rules of its main contradictions and redundancies.
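
As a rough illustration of what "counting co-occurrences" on dichotomous data means, the sketch below computes the usual support and confidence measures on a small subjects × properties matrix. It is not the thesis's method: the toy data, the property names `A`, `B`, `C` and the helper functions are all hypothetical, and the brute-force enumeration stands in for any real itemset-mining algorithm.

```python
from itertools import combinations

# Hypothetical toy data: one row per subject, one True/False column per property.
PROPERTIES = ["A", "B", "C"]
MATRIX = [
    [True,  True,  False],
    [True,  True,  True],
    [True,  False, False],
    [False, True,  True],
]

def support(itemset):
    """Fraction of subjects for which every property in `itemset` is True."""
    cols = [PROPERTIES.index(p) for p in itemset]
    hits = sum(all(row[c] for c in cols) for row in MATRIX)
    return hits / len(MATRIX)

def confidence(antecedent, consequent):
    """Share of subjects satisfying `antecedent` that also satisfy `consequent`."""
    base = support(antecedent)
    if base == 0:
        return 0.0  # the rule is never triggered in this data
    return support(antecedent | consequent) / base

def frequent_itemsets(min_support):
    """Brute-force list of all itemsets whose support reaches `min_support`."""
    found = {}
    for size in range(1, len(PROPERTIES) + 1):
        for combo in combinations(PROPERTIES, size):
            s = support(frozenset(combo))
            if s >= min_support:
                found[frozenset(combo)] = s
    return found

if __name__ == "__main__":
    for items, s in frequent_itemsets(0.5).items():
        print(sorted(items), s)
    print("conf(C -> B) =", confidence(frozenset({"C"}), frozenset({"B"})))
```

On this toy matrix the rule C -> B holds for every subject that has C, which shows why raw rules need a significance check: with only four subjects, such a link could easily arise by chance.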

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules</title>
<title xml:lang="fr">Extraire et valider les relations complexes en sciences humaines : statistiques, motifs et règles d'association</title>
<author>
<name sortKey="Cadot, Martine" sort="Cadot, Martine" uniqKey="Cadot M" first="Martine" last="Cadot">Martine Cadot</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-31478" status="OLD">
<idno type="IdRef">168612127</idno>
<idno type="ISNI">0000 0001 2193 4396</idno>
<idno type="RNSR">199613836L</idno>
<orgName>Laboratoire de Semio-Linguistique, Didactique et Informatique</orgName>
<orgName type="acronym">LASELDI</orgName>
<desc>
<address>
<addrLine>30 rue Mégevand 25030 Besançon cedex </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr/pages/fr/ea-2281---laseldi-7966.html</ref>
</desc>
<listRelation>
<relation name="EA2281" active="#struct-242365" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle name="EA2281" active="#struct-242365" type="direct">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:tel-00594174</idno>
<idno type="halId">tel-00594174</idno>
<idno type="halUri">https://tel.archives-ouvertes.fr/tel-00594174</idno>
<idno type="url">https://tel.archives-ouvertes.fr/tel-00594174</idno>
<date when="2006-12-12">2006-12-12</date>
<idno type="wicri:Area/Hal/Corpus">002186</idno>
<idno type="wicri:Area/Hal/Curation">002186</idno>
<idno type="wicri:Area/Hal/Checkpoint">003E24</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">003E24</idno>
<idno type="wicri:Area/Main/Merge">005077</idno>
<idno type="wicri:Area/Main/Curation">004F12</idno>
<idno type="wicri:Area/Main/Exploration">004F12</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules</title>
<title xml:lang="fr">Extraire et valider les relations complexes en sciences humaines : statistiques, motifs et règles d'association</title>
<author>
<name sortKey="Cadot, Martine" sort="Cadot, Martine" uniqKey="Cadot M" first="Martine" last="Cadot">Martine Cadot</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-31478" status="OLD">
<idno type="IdRef">168612127</idno>
<idno type="ISNI">0000 0001 2193 4396</idno>
<idno type="RNSR">199613836L</idno>
<orgName>Laboratoire de Semio-Linguistique, Didactique et Informatique</orgName>
<orgName type="acronym">LASELDI</orgName>
<desc>
<address>
<addrLine>30 rue Mégevand 25030 Besançon cedex </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr/pages/fr/ea-2281---laseldi-7966.html</ref>
</desc>
<listRelation>
<relation name="EA2281" active="#struct-242365" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle name="EA2281" active="#struct-242365" type="direct">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>Association Rules</term>
<term>Data Cleaning and Preprocessing</term>
<term>Data Mining</term>
<term>Fuzzy Itemsets</term>
<term>Fuzzy rules</term>
<term>Itemsets</term>
<term>Knowledge Discovery</term>
<term>Machine Learning</term>
<term>Randomisation Test</term>
<term>Statistical Interaction</term>
<term>Statistical Significance</term>
<term>Text Mining</term>
</keywords>
<keywords scheme="mix" xml:lang="fr">
<term>apprentissage artificiel</term>
<term>codage et recodage des données</term>
<term>extraction de connaissances</term>
<term>fouille de données</term>
<term>fouille de textes</term>
<term>interaction statistique</term>
<term>motifs</term>
<term>motifs flous</term>
<term>nettoyage et prétraitement des données</term>
<term>règles d'association</term>
<term>règles floues</term>
<term>significativité statistique</term>
<term>test de randomisation</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This thesis is about data mining in the human sciences. Data mining, a branch of artificial intelligence, is a set of methods for extracting knowledge from electronic data. Among these methods, the extraction of itemsets and association rules builds a symbolic representation of the structure of the data, as classical statistical methods do, but unlike them it can handle complex and very large data sets. However, this computer science model, obtained by counting co-occurrences, is not easy for scientists to use: it works on dichotomous (True/False) data, its raw results are difficult to interpret, and its validity can seem doubtful to researchers who work with statistics. We propose four techniques, which we designed and tested on real data, to make the extraction of itemsets and association rules easier for scientists to use: 1) our randomisation test, based on "exchanges in cascade" in the subjects × properties matrix, identifies the statistically significant links between properties; 2) our fuzzification of itemset and association rule extraction produces fuzzy association rules close to the fuzzy rules defined by researchers in the fuzzy community around Zadeh; 3) our Midova algorithm extracts only interactions; and 4) our meta-rules clean the set of association rules of its main contradictions and redundancies.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Franche-Comté</li>
</region>
<settlement>
<li>Besançon</li>
</settlement>
<orgName>
<li>Université de Bourgogne Franche-Comté</li>
<li>Université de Franche-Comté</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Franche-Comté">
<name sortKey="Cadot, Martine" sort="Cadot, Martine" uniqKey="Cadot M" first="Martine" last="Cadot">Martine Cadot</name>
</region>
</country>
</tree>
</affiliations>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 004F12 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 004F12 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Hal:tel-00594174
   |texte=   Extraction of Complex Relations in Humanistic : Statistics, Itemsets and Association Rules
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022